ABSTRACT


Information specialists who work in any of the related 
fields of health science will often seek drug information. 

One of the biggest problems in retrieving drug information is
the number of ways a drug can be described (variations of drug
nomenclature). This study examined the indexing of thirty
selected drugs in four online databases (Analytical Abstracts,
BIOSIS PREVIEWS, Pharmaceutical News Index, and SCISEARCH).
The thirty drugs were first searched against two dictionary
files (CHEMNAME and THE MERCK INDEX ONLINE) to identify all
associated names and synonyms. Each term thus identified was
then searched in each of the four databases. The search results
are analyzed by indexed terms and compared across the databases.

The study is intended to help searchers improve recall and
comprehensiveness when searching for drug information by
identifying the most useful search terms or search term
combinations. The study seeks to answer the questions, "Is recall
improved by searching: all available nonproprietary names?;
all available proprietary names?; and/or a combination of
nonproprietary and proprietary names?"

The study results underscore the need for a basic understanding
of pharmaceutical nomenclature, effective use of chemical
dictionary files, awareness of indexing differences among
databases, and a well-planned search strategy that remains
flexible enough to change as necessary.
1. The Problem
1.1 Introduction 

Information specialists who work in any of the related 
fields of health science will often seek drug information. 
Depending on the nature of the search question, a variety of
databases may be consulted, ranging from general science to
chemistry, microbiology, physiology, various medical
specialties, and business and legal regulatory sources. To ensure
comprehensiveness, it is usually necessary to search more than
one database carefully, and the searcher must be fully cognizant
of differences in indexing practices among them.

One of the biggest problems in seeking drug information
is drug nomenclature, because no single universal name exists
for a chemical (Snow 1989). Each source will specify its
preferred nomenclature in indexing chemical substances. But
when the preferred term is not available, the searcher must
determine what alternate terminology is used. To find
information about a specific drug, it is usually necessary to
search a range of possible names. The non-subject expert or
inexperienced searcher can encounter numerous pitfalls such as
inconsistent indexing, nomenclature variations, and varying
indexing policies between databases.


Complicating the problem of variant terminology is that
clinical literature is indexed differently from chemical
literature, which in turn is indexed differently from the
pharmaceutical business literature. A search using only one or
two common names may result in low recall and much missed
information. User satisfaction is not a measure of adequate
retrieval, since many users will be unaware of what they are
missing.

A careful analysis of the indexing practices of databases
without controlled indexing (e.g., thesauri and other search
aids) is necessary to aid in developing more effective search
strategies. Particularly with the rising costs of online
searching, a search involving too many terms (or too many
unlikely terms) will add excessively to the cost of a search
without improving recall.


1.2 Statement of the Problem 
This study examines the indexing of drugs in the literature
and compares actual drug indexing to stated indexing
policies in selected databases. The goal is to help health
science information specialists, end-users, and/or non-subject
experts improve recall and comprehensiveness when searching
for drug information by identifying the most useful search
terms (or search term combinations) when seeking information
about a drug.

The study seeks to answer the questions, "Is recall
improved by searching:

- all available nonproprietary names?

- all available proprietary names?

- a combination of nonproprietary and proprietary names?"


1.3 Limitations 

This is a small-scale study which focuses on four
representative databases (and those only on the Dialog
Information Services system), and in no way can these results be
considered definitive for all databases. Thirty drugs (of
thousands available) have been tested in this study. It was hoped
these thirty would be representative of drugs in general, though
they may in fact present a biased set. Finally, results for
comparison questions (one term vs. all terms) were calculated on
an additive basis without removing duplicates, and the single
term used for comparison was selected arbitrarily (a subjective
assessment of the "most common" term in the U.S.).
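The effect of the additive calculation can be seen in a small
sketch. The record identifiers below are hypothetical (they are
not taken from the study), and the function names are
illustrative. The point is only that summing postings per term
counts a record twice when it is indexed under two names, whereas
a set union counts it once.

```python
# Hypothetical record IDs retrieved by each search term (illustrative only).
postings = {
    "albuterol": {101, 102, 103},
    "salbutamol": {103, 104, 105, 106},  # record 103 is indexed under both names
}


def additive_total(postings):
    """Sum of postings per term: duplicates are counted more than once."""
    return sum(len(records) for records in postings.values())


def unique_total(postings):
    """Number of distinct records across all terms: duplicates removed."""
    return len(set().union(*postings.values()))


print(additive_total(postings), unique_total(postings))
```

The additive figure is therefore an upper bound on the number of
distinct citations a combined search would return.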


1.4 Definitions 

For this study, the term 'drug' has been defined as "any
chemical compound that may be used as an aid in the diagnosis,
treatment, or prevention of disease or for any other
therapeutic purpose." The term 'drug nomenclature' refers to the
system of terms used in the science of pharmacology.

Goodman and Gilman (1980) define pharmacology as
encompassing "the knowledge of the history, source, physical and
chemical properties, compounding, biochemical and physiological
effects, mechanisms of action, absorption, distribution,
biotransformation and excretion, and therapeutic and other uses
of drugs." This description makes it clear that the subject of
pharmacology (the science of drugs) is quite extensive.
2. LITERATURE REVIEW


Many drugs, particularly drugs that have been available
for several years, have numerous synonyms, trade names, and 
other means of identification. McGowan and Mater (1985/86) 
describe a method for identifying chemicals and drugs using a 
variety of tools such as the Physicians' Desk Reference, The 
Merck Index, and the Chemical Abstracts Index Guide in addition 
to online chemical dictionaries. They emphasize the usefulness 
of the Chemical Abstracts Service (CAS) registry number as a 
simple search strategy in databases that include registry num- 
bers. They also point out differences in assigned chemical 
names and give an example of the same drug identified in Chemi- 
cal Abstracts (parent group name followed by substituents in 
order of importance) and in The Merck Index (chemicals listed 
in non-inverted order; substituent groups followed by parent 
group name). They continue on to discuss effective use of on- 
line chemical dictionary files and search strategies when little 
is known about a substance. 

Bronson (1992) elaborates further on using two chemical
dictionary files, CHEMLINE and CHEMID, to find information about
drugs. By providing more search terms, these files are useful
for developing more comprehensive strategies. CHEMLINE and
CHEMID provide current and superseded CAS registry numbers and
up to 200 synonyms per chemical. These files also allow name
fragment searches and molecular formula searches.


Deaves and Pache (1989), in their article "Chemical and
Numerical Indexing for the INSPEC Database," discuss
difficulties of searching for chemical data online. Although
they focus on inorganic substances, the difficulties apply
equally to organic substances. For example, there can be
different but very common formulae for the same substance.
Hyphens can have different meanings in different contexts, and
formulae with subscripts can be difficult to search online.


Roth (1985) discusses pitfalls of chemical literature
searching such as inconsistent indexing, nomenclature variations,
language, and transliteration. These present problems for the
non-subject expert or inexperienced searcher in particular.
He presents an excellent literature review of search problems
such as:

- indexing services covering the same subject
  (supposedly comprehensively) but varying in recall

- the expense of searching online nomenclature files

- the evolving nature of nomenclature

Comprehensive searching "depends on carefully searching a wide
range of publications and/or databases." He concludes with
examples of questions that are most dangerous for inexperienced
searchers or those without subject expertise and most likely to
consume hundreds of dollars yet yield unsatisfactory results.
Barber et al. (1988), in "Case Studies of the Indexing and
Retrieval of Pharmacology Papers," present a detailed
analysis of the coverage and indexing of thirty papers on
pharmacological topics and conclude that there was considerable
variation in the indexing applied to drug formulations. While
the study did not focus specifically on drug nomenclature,
examples of variant indexing of drugs were included.

Many papers describe and discuss problems of drug and/or
chemical searching in the literature but do not systematically
study the problem. Dwight Tousignaut (1982), in "Searching
'Pharmacy' Databases: Nomenclature Problems and Inconsistencies,"
points out such problems as: inconsistencies even in
"standardized" nomenclature schemes; drug names that vary from
country to country; errors in source articles that lead to
errors in the secondary literature; and the confusion added when
manufacturers use the same trade name in more than one country
but with a different formulation in each. Such inconsistencies
are not likely to be identified by the non-subject expert
searcher.

In "Indexing: Old Methods, New Concepts," Tousignaut (1987)
compares traditional indexing to a concept indexing scheme
created as a result of developing Drug Information Fulltext.
His conclusion was that "the future of fulltext will depend on
controlled indexing approaches that offer easy access and
dependable results."


Bonnie Snow (1982), in "Trade Names in Medicine: Searching
for Brand Name Comparisons and New Product News," discusses the
need to consult search aid databases for references to
alternate names for trade names.


Snow has written extensively about the general topic of
searching for drug information. In her book, Drug Information:
A Guide to Current Resources (1989), she devotes one chapter to
"Drug Nomenclature," which is an excellent description and
definition of the wide range of drug names that exist. In another
chapter she describes "Identification and Nomenclature Sources"
as aids in finding alternate names for particular drugs.

In the chapter on "Abstracting and Indexing Services," 
Snow discusses in detail several selected online bibliographic 
databases useful for the pharmaceutical searcher and describes 
the chemical indexing policies of each. As she points out, her 
guide cannot provide more than an overview of each database and 
general statements about the indexing policies. A typical 
example is the description of chemical indexing in DE HAEN DRUG 
DATA: 

    USAN generic names are preferred nomenclature
    in the DE HAEN database. Trade names are searchable
    in many DE HAEN records. If the source author refers
    to a drug by trade names, the online record will
    include the names given. Chemical names, CAS registry
    numbers, and molecular formulas are indexed in many
    but not all...records.

This illustrates the multiple ways a drug may be indexed within
a single database and makes clear the difficulty of searching
more than one database when there is no standard method of
indexing drugs.

In her chapter on "Online Database Selection," Snow
presents a chart showing the "Preference Hierarchy for Pharma- 
ceutical Nomenclature in Selected Online Databases." This 
reinforces the differences between databases and the need for 
searching under a range of possible names for a particular drug. 

For this study, the term 'drug' is used to describe a
pharmacologically active chemical or compound. Each drug entity
can be described in a variety of ways including:

- Chemical Name: In the United States this generally
  follows the American Chemical Society conventions
  for naming compounds.

- Molecular Formula: Describes a compound by atom count
  (e.g., C1 6H5,N0,).

- Chemical Abstracts Service (CAS) Registry Number:
  This is a unique identifying code for a substance.

- Nonproprietary Name or Generic Name: This is a
  simplified chemical name. The terms 'nonproprietary'
  and 'generic' are commonly used interchangeably
  although the terms are not synonymous. (The generic
  name refers to a class of drugs while a nonproprietary
  name refers to a specific compound.) In the U.S.,
  this term is assigned by the United States Adopted
  Names Council and is referred to as a U.S. Adopted
  Name (USAN). Other agencies may assign different
  names to the same substance and variants may include
  the British Approved Name (BAN) or the International
  Nonproprietary Name (INN) assigned by the World Health
  Organization.

- Drug Investigational Code or Research Number: A code
  or abbreviation assigned to new products under
  investigation for ease of reference and for security
  of a new drug discovery. More than one research code
  may be used for the same drug.

- Trade Name or Proprietary Name: This is usually a
  registered trademark of the manufacturer to identify
  a specific product formulation and indication. Many
  products have multiple trade names and even when
  manufactured by one company may have different trade
  names in different countries.
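
One way to hold these name types together for a single drug
entity is a small record structure. The sketch below is an
illustration only: the class and field names are assumptions, not
part of the study, and the example values are limited to names
mentioned in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class DrugEntity:
    """All known names for one drug, grouped by name type (hypothetical layout)."""
    usan: str                                                   # current U.S. Adopted Name
    other_nonproprietary: list = field(default_factory=list)   # BAN, INN, older names
    proprietary: list = field(default_factory=list)            # trade names
    investigational_codes: list = field(default_factory=list)  # research numbers
    cas_registry_numbers: list = field(default_factory=list)
    chemical_names: list = field(default_factory=list)

    def nonproprietary_terms(self):
        """Current nonproprietary name followed by its variants."""
        return [self.usan] + self.other_nonproprietary

    def all_terms(self):
        """Every searchable name, in a fixed order, without repeats."""
        ordered = (self.nonproprietary_terms() + self.proprietary
                   + self.investigational_codes + self.cas_registry_numbers
                   + self.chemical_names)
        return list(dict.fromkeys(ordered))


# Example values use only names that appear in this paper.
cifenline = DrugEntity(usan="cifenline", other_nonproprietary=["cibenzoline"])
mifepristone = DrugEntity(usan="mifepristone", investigational_codes=["RU 486"])
print(cifenline.all_terms())
```

Grouping by name type keeps it easy to search one category at a
time or all categories together.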


3. METHODOLOGY 

This study uses a comparative methodology to evaluate
the indexing of drugs within four online databases. A
comparative study obtains comparable measures on different
databases; that is, one searches the same terms in each of the
four databases, tabulates the results, and examines the
similarities or differences in the indexing of each database and
its effect on retrieval performance.

Thirty drug entities were chosen for study. Fifteen of
the drugs chosen have been commercially available in the United
States for at least ten years. Fifteen of the drugs have become
available only within the last five years, some of them still
in the investigational process. Each drug entity was searched
against two dictionary files (CHEMNAME and THE MERCK INDEX ONLINE)
for all associated names including trade or proprietary names,
chemical names, synonyms or acronyms, investigational codes or
research numbers, Chemical Abstracts Service Registry Numbers,
and generic or nonproprietary names.

Next, each of the terms thus identified was searched
in four online databases. The four databases selected for the
study (Analytical Abstracts; BIOSIS PREVIEWS; Pharmaceutical
News Index; and SCISEARCH) offer broad coverage and subject
scope, but none utilizes controlled vocabulary indexing. Many
alternate databases could have been selected for study. Since
the drug information searcher is often required to find
chemical, clinical, business, or general information about a
particular drug, a database representing each type of search was
selected.

SCISEARCH indexes 4,500 journal titles from more than fifty 
countries and offers broad coverage of the general science and 
technology literature. For this study, DIALOG File 34 (covering 
1988 to the present) was studied. A journal issue is likely to 
be cited in SCISEARCH within two weeks of publication making it 
a good source of current information. Subject access is limited 
to title words and abstracts are not always included. Therefore, 
the searcher must use numerous alternate names to improve recall. 

BIOSIS PREVIEWS is international in scope and encompasses
research in the biological and biomedical sciences. For this
study, DIALOG File 55 (covering 1985 to the present) was studied.
About half of the citations are clinically oriented, and the
pharmaceutical searcher uses it to locate information on drug
development, toxicity, and pharmacology. The preferred terms in
indexing are the U.S. Adopted Names (USAN), but when mentioned in
the original source, investigational codes and proprietary names
are also included.

Analytical Abstracts is a chemically-oriented database
published by the Royal Society of Chemistry and covering more
than 1,300 journal titles, books, reports, national standards,
and conference proceedings. Drug information searchers
utilize Analytical Abstracts for topics involving biochemistry
or medicinal chemistry. The preferred nomenclature is the
chemical name and CAS registry number.

The Pharmaceutical News Index database covers current and
retrospective business news related to the pharmaceutical
industry. It indexes information on legislation and regulations,
research and development, and market analysis. There is no
controlled vocabulary, and trade names, generic names, chemical
names, and manufacturer code names may all be indexed.

To compare actual drug indexing to stated indexing policies 
of the four databases, the search results for each drug were 
tabulated and graphically presented in terms of percentages of: 
CAS registry numbers indexed; nonproprietary names indexed; 
investigational codes indexed; and proprietary names indexed. 
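
A minimal sketch of how such a percentage can be computed,
assuming a term counts as "indexed" when it retrieves at least
one citation (the sense in which the results sections report
terms that "retrieved any citations"). The function name is
illustrative; the posting counts are the Analytical Abstracts
figures reported in Section 4.2 for a handful of nonproprietary
terms.

```python
def percent_indexed(postings_by_term):
    """Percentage of terms that retrieved at least one citation."""
    if not postings_by_term:
        return None  # nothing to measure when no terms were identified
    indexed = sum(1 for count in postings_by_term.values() if count > 0)
    return 100.0 * indexed / len(postings_by_term)


# Postings in Analytical Abstracts as reported in Section 4.2.
analytical_abstracts = {
    "albuterol": 11,
    "salbutamol": 71,
    "sulfasalazine": 0,
    "cifenline": 0,
    "desflurane": 0,
}
print(percent_indexed(analytical_abstracts))
```

The same calculation applies to each term type (CAS registry
numbers, nonproprietary names, investigational codes, and
proprietary names) once the postings are grouped by type.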

Data was also tabulated and graphically presented showing 
the results of the following searches: 


A) One proprietary name vs. all proprietary names 
identified 


B) One nonproprietary name vs. all nonproprietary 
names identified 


C) All nonproprietary names identified vs. all 
nonproprietary names plus all proprietary names 


Comparisons of these results demonstrate whether recall
is increased by searching all available nonproprietary names;
all available proprietary names; or a combination of all
nonproprietary names plus all proprietary names.
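
The three comparisons can be pictured as pairs of search
statements, each pair consisting of a narrower and a broader
term set. The sketch below builds generic Boolean OR expressions;
it does not reproduce the command syntax of any particular online
host. The helper name is hypothetical, and the two trade names
are illustrative examples rather than terms drawn from the
study's lists.

```python
def or_query(terms):
    """Join search terms into a single Boolean OR expression."""
    return " OR ".join(terms)


nonproprietary = ["albuterol", "salbutamol"]
proprietary = ["Ventolin", "Proventil"]  # illustrative trade names, not from the study

# Each question compares a narrower search with a broader one.
question_sets = {
    "A": (or_query(proprietary[:1]), or_query(proprietary)),
    "B": (or_query(nonproprietary[:1]), or_query(nonproprietary)),
    "C": (or_query(nonproprietary), or_query(nonproprietary + proprietary)),
}

for label, (narrow, broad) in question_sets.items():
    print(label, "|", narrow, "|", broad)
```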

Finally, the search results were tabulated and graphically
presented for each database (looking at the thirty drugs
in aggregate) showing: percentage of nonproprietary terms
indexed; percentage of proprietary terms indexed; percentage of
CAS registry numbers indexed; and percentage of investigational
codes indexed.
4. RESULTS
4.1 Introduction 
Thirty drug entities were first searched against two 
dictionary files to identify all associated terms. All of 
these terms were then searched in four online databases 
(Analytical Abstracts, BIOSIS PREVIEWS, Pharmaceutical News
Index, and SCISEARCH) and results are discussed by database. 
Results for the thirty drug entities in aggregate (by type 
of term) are illustrated in Figure 1. The search results 
presented for each database include: 
- Graphs showing percentage of: 
CAS registry numbers indexed 
nonproprietary terms indexed 
investigational codes indexed 
proprietary terms indexed 
- Graphs showing comparison question results (i.e., 
one term vs. all terms) 
4.2 Analytical Abstracts 
The search results from Analytical Abstracts for percentage
of Chemical Abstracts Service (CAS) registry numbers indexed
are shown in Figure 2. Because Analytical Abstracts is
chemically-oriented, it is not surprising that the majority of
terms (70% of total) were retrievable using the CAS registry
numbers. Each of the terms that was not retrievable using
the registry number came from the recent drug group, a
reflection of the less current coverage or narrower scope of the
database. Unfortunately, this was the only database of the four
studied that utilized registry numbers.

Fig. 1. Percent of Total Terms Indexed for Thirty
Drugs in Aggregate
[Graph. Axis: percent of total terms.]


No proprietary terms retrieved any citations at all,
consistent with the orientation specifically toward analytical
chemistry. Investigational codes were not commonly searchable:
only 13% of the total codes identified retrieved any citations
(see Figure 3). Again this reflects the narrow subject
scope and orientation to the needs of the analytical chemist.

Most of the drug entities (97%) were retrievable using the
current nonproprietary term (USAN), as shown in Figure 4.
However, very few of the synonyms retrieved any citations. This
probably reflects the emphasis of the database on chemical names
and CAS registry numbers. Two nonproprietary terms (cifenline
and desflurane) retrieved no citations, possibly reflecting the
smaller scope of Analytical Abstracts or less current coverage,
as both of these drugs are of more recent discovery. The
'cifenline' entity was retrievable using its previous
nonproprietary term, 'cibenzoline.' Also, Analytical Abstracts
reflects its British influence (it is produced by the Royal
Society of Chemistry) in its use of British spelling conventions.
Thus 'sulfasalazine' retrieved no citations while
'sulphasalazine' retrieved several. Likewise, 'albuterol'
retrieved eleven citations while the British term 'salbutamol'
retrieved seventy-one citations.
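
Spelling differences of this kind can be anticipated before
searching. The sketch below is a deliberately narrow
illustration: it handles only the 'sulf'/'sulph' alternation seen
in the example above, and the function name is hypothetical.
Pairs such as albuterol and salbutamol are different names rather
than spelling variants and still have to come from a dictionary
file.

```python
def spelling_variants(term):
    """Return the term plus its American/British 'sulf'/'sulph' variant, if any."""
    variants = {term}
    if "sulph" in term:
        variants.add(term.replace("sulph", "sulf"))   # British -> American
    elif "sulf" in term:
        variants.add(term.replace("sulf", "sulph"))   # American -> British
    return sorted(variants)


print(spelling_variants("sulfasalazine"))
print(spelling_variants("albuterol"))
```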



Of the comparison questions, only question B can be analyzed
(see Figure 5) because no proprietary terms were searchable
in Analytical Abstracts. As discussed previously, in
general the current nonproprietary term (USAN) retrieved the
majority of citations and addition of secondary terms did not
improve recall. The exceptions involved spelling variants and
use of older terminology.


4.3 SCISEARCH

The SCISEARCH database offers broad, multidisciplinary
subject coverage of the literature of science and technology.
In addition, a journal issue is likely to be cited in
SCISEARCH within two weeks of publication, and sources are
indexed cover-to-cover. However, author abstracts, when
available, are searchable only in records added since January
1991. All terms in the database are derived from the author's
language, and this is a noticeable factor in which terms retrieve
citations.

Therefore it is not surprising to find multiple hits when
searching on nonproprietary names and alternate nonproprietary
terms (see Figure 6). As in Analytical Abstracts, there is a
significant increase in recall when using older or previous
nonproprietary terms compared to the newer or current term
(e.g., cifenline - 5 hits, cibenzoline - 65 hits; albuterol -
273 hits, salbutamol - 952 hits).

Curiously, there was very low recall (16 hits) on the term
'methocarbamol' and zero postings on any of its alternate terms,
including both nonproprietary and proprietary terms. Possibly,
this reflects the fact that this drug has long been available
and has not been studied in recent years. In contrast, the
older drug 'sulfasalazine' had numerous postings both on that
term specifically (634) and on its alternate nonproprietary
terms (867). This may indicate ongoing research on this entity.

Only about a third of the investigational codes were
indexed (see Figure 7). This is not surprising, since authors
would more likely refer to a drug being studied by the
nonproprietary term. The exceptions are entities which are better
or equally known by their investigational codes (e.g.,
mifepristone - 377 hits vs. RU 486 - 692 hits).

Most of the entities generated hits on at least one of the 
proprietary terms identified (see Figure 8). Therefore entities 
with only one proprietary term are likely to show 100% retrieval. 
But where multiple proprietary terms are available, very few 
additional postings were gained. Of a total of 184 proprietary 
terms identified, only 24% were retrieved in SCISEARCH. When 
comparing the total number of postings for each term (nonpro- 
prietary vs. proprietary) nonproprietary terms yielded the high- 
est recall by far with very few postings for any of the proprie- 
tary terms. 

Figure 9 illustrates the search results of comparison
question A, which compares one proprietary term to all
proprietary terms. The percent increase in number of hits is
misleading because of the low number of hits on any one term.
Thus, picking up only a few additional hits may appear as a
100% or greater increase in hits.

Fig. 7. SCISEARCH: Percent of Investigational Codes
Indexed
[Graph. Vertical axis: the thirty drugs (acecainide,
azithromycin, cefixime, cifenline, desflurane, dilevalol,
halofantrine, mifepristone, nedocromil, ondansetron, oxaprozin,
paroxetine, tacrine, terbinafine, viloxazine, albuterol,
baclofen, clobetasol, etoposide, indapamide, mechlorethamine,
mefloquine, methocarbamol, methotrexate, methylphenidate,
miconazole, nifedipine, prazosin, ranitidine, sulfasalazine).
Horizontal axis: percent of terms indexed.]

Fig. 8. SCISEARCH: Percent of Proprietary Terms
Indexed
[Graph. Vertical axis: the same thirty drugs; acecainide and
desflurane are marked N/A. Horizontal axis: percent of terms
indexed.]

Fig. 9. SCISEARCH: One Proprietary Term vs.
All Proprietary Terms
[Graph. Vertical axis: the same thirty drugs; acecainide,
desflurane, mifepristone, nedocromil, and terbinafine are marked
N/A. Horizontal axis: percent increase in number of hits.]


In question B which compares one nonproprietary term to 
all nonproprietary terms, a significant increase in recall is 
observed (see Figure 10). This is particularly true for non- 
proprietary terms with older synonyms where a search on all 
identified nonproprietary terms is necessary for more complete 
recall (e.g., mechlorethamine - 277 hits vs. mechlorethamine + 
synonyms - 544 hits). However, this is not always the case 
because even some older terms show little or no increase in 
recall when combined with other synonyms (e.g., methylphenidate - 
308 hits vs. methylphenidate + synonyms - 308 hits). 
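
The percent increases behind these comparisons follow from simple
arithmetic, sketched below with the SCISEARCH figures quoted
above. The function name is illustrative, and returning None for
a zero baseline is an assumption made here for the sketch, not a
rule stated in the study.

```python
def percent_increase(hits_one_term, hits_all_terms):
    """Percent increase in hits when moving from one term to all terms."""
    if hits_one_term == 0:
        return None  # increase is undefined when the single term retrieves nothing
    return 100.0 * (hits_all_terms - hits_one_term) / hits_one_term


# SCISEARCH figures quoted in the text (question B).
print(round(percent_increase(277, 544), 1))  # mechlorethamine vs. + synonyms
print(percent_increase(308, 308))            # methylphenidate vs. + synonyms
# A single extra hit on a very small base doubles the count.
print(percent_increase(1, 2))
```

Because the totals are additive, a record indexed under two
synonyms contributes twice to the all-terms figure, so these
percentages are upper bounds on the true gain in recall.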

As discussed earlier, the result of searching on all
nonproprietary terms identified yields the greatest number of
postings. In general, the addition of proprietary terms does
not significantly increase recall (see Figure 11).


4.4 BIOSIS PREVIEWS

The results from BIOSIS PREVIEWS are very similar to the
search results in SCISEARCH, including the highest number of
postings for the nonproprietary term; higher postings for older
synonyms; and relatively few postings for proprietary terms. An
interesting exception occurs with two terms where acronyms yield
significantly more hits than the nonproprietary term (tacrine -
183 hits vs. THA - 1216 hits; acecainide - 12 hits vs.
NAPA - 122 hits). This is consistent with the coverage of
original research in BIOSIS PREVIEWS and the availability of
abstracts in all records added since July 1976. Therefore, the
more commonly known acronyms are likely to be included as terms.

Also noticeable was the large number of postings per term
(particularly for the nonproprietary term) compared to the other
databases. This may reflect broader scope and/or coverage, or
the value of abstracts being available to search, because the
total number of records in BIOSIS PREVIEWS is similar to the
number in SCISEARCH.

BIOSIS PREVIEWS includes more of the investigational codes 
than the other databases (47% yielded postings) (see Figure 12). 
Again, because it covers original research and includes abstracts 
it is more likely that investigational codes will be indexed. 

As in SCISEARCH, proprietary terms generally do not result 
in many hits (see Figure 13) especially compared to nonproprie- 
tary terms (see Figure 14). However BIOSIS PREVIEWS indexed 
32% of the total number identified which is slightly greater 
than the total indexed by SCISEARCH (24%). 

The comparison question results once again are not
significantly different from the SCISEARCH results. Small numbers
of postings with an additional term may appear to greatly
increase recall (e.g., 1 hit vs. 2 hits shows a 100% increase).
In general, a search on multiple proprietary terms will add
citations, but the increase in total number of hits will not be
great (see Figure 15).

Results for questions B and C are similar to the SCISEARCH
results. That is, searching on multiple nonproprietary terms can
significantly increase recall, although not in every case (see
Figure 16). And adding all proprietary terms to all
nonproprietary terms does not significantly improve recall (see
Figure 17).


4.5 Pharmaceutical News Index 

Pharmaceutical News Index (PNI) focuses on international 
pharmaceutical business information. PNI often adds synonyms, 
proprietary names, acronyms, and abbreviations as descriptors. 
For this reason, the searcher may benefit greatly by identi- 
fying and searching as many terms as are available for greatest 
recall. 

Each of the primary nonproprietary terms was indexed (see
Figure 18), which is consistent with the coverage of this
database. Once again, there were a greater number of postings for
the older terms (e.g., 'cifenline' vs. 'cibenzoline') and for
acronyms (e.g., tacrine vs. THA and acecainide vs. NAPA),
although not as great an increase as in SCISEARCH or BIOSIS
PREVIEWS.

About 37% of the total identified codes resulted in post- 
ings (see Figure 19). This is consistent with the database 
coverage of such areas as R & D in progress, pharmaceutical 
research, and New Drug Application (NDA) approvals. 

PNI significantly differed from the other databases studied
in the coverage of proprietary terms (see Figure 20).
Of the 184 terms identified, 58% were indexed. Again,
this is consistent with the business focus of the database
and its coverage of advertising campaigns, market analyses,
and prescription markets, where proprietary terms would
predominate. Also, the number of postings per proprietary term
was significantly greater than the number in the two previous
databases. Therefore in this database it is useful to identify
multiple terms prior to searching, as no one type of term
(e.g., the nonproprietary term) predominates.

The difference in coverage of proprietary terms is notice- 
able in the results of the comparison questions. In question A, 
which compares one proprietary term to all proprietary terms, lar- 
ger increases in recall are observed (see Figure 21). Similarly, 
in question B, moderate increases in recall are observed when 
searching all identified nonproprietary terms vs. a single term 
(see Figure 22). 

The most notable change is in question C which compares 
recall for all known nonproprietary terms vs. all nonproprietary 
terms plus all proprietary terms (see Figure 23). Because recall 
among proprietary terms is greater (in general) in PNI, it is not 
surprising that searching on all nonproprietary terms plus pro- 
prietary terms significantly increases recall. These search 
results underscore the importance of searching multiple terms
in this database.
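
The percent increases reported for questions A, B, and C can be illustrated with a minimal Python sketch. It assumes that the results of each search are available as a set of record identifiers, so that a record retrieved by several terms is counted once. The function name and the sample identifiers are hypothetical and are not data from this study.

```python
def percent_increase(baseline_hits, expanded_hits):
    """Percent increase in unique records when the search is expanded
    from a baseline term set to a larger term set."""
    base = len(baseline_hits)
    if base == 0:
        # The increase is undefined when the baseline retrieves nothing.
        return None
    total = len(baseline_hits | expanded_hits)
    return 100.0 * (total - base) / base


# Hypothetical record identifiers, for illustration only.
single_term = {"rec1", "rec2", "rec3", "rec4"}
all_terms = single_term | {"rec5", "rec6"}
print(percent_increase(single_term, all_terms))  # 50.0
```

A baseline of zero hits is returned as None rather than as a number, since no percentage can be computed for a term that was not indexed.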


[Chart: percent increase in number of hits, by drug, for the
thirty drugs studied]

Fig. 21. Pharmaceutical News Index: One Proprietary
Term vs. All Proprietary Terms


[Chart: percent increase in number of hits, by drug, for the
thirty drugs studied]

Fig. 22. Pharmaceutical News Index: One Nonproprietary
Term vs. All Nonproprietary Terms


[Chart: percent increase in number of hits, by drug, for the
thirty drugs studied]

Fig. 23. Pharmaceutical News Index: All Nonproprietary
Terms vs. All Nonproprietary Terms + All Proprietary Terms




5. CONCLUSIONS 

‘Drug information’ is a very broad term that often draws
on the literature of chemistry, various medical specialties,
related sciences such as biochemistry and microbiology, and
business events of the pharmaceutical industry. No single
resource will contain all types of drug information, and there
is no one standard form of indexing ‘drug information.’ This
study looked closely at the drug indexing of four distinctly
different online databases, none of which uses controlled
indexing.

In each of the databases, recall was highest for
nonproprietary terms, ranging from 62% to 84% of the total
nonproprietary terms identified. However, use of the one current
nonproprietary term does not consistently provide the best
recall. In several cases, an older synonym or more commonly used
acronym retrieved more postings than the currently accepted
term. Also, differences in recall were observed when searching
the British-produced database, where British spellings
predominate and must be considered.

Analytical Abstracts indexes CAS Registry Numbers (RNs), and
a search on RNs yielded 70% retrieval. Combining nonproprietary
terms with RNs yields very high recall. Of the thirty entities
studied, this strategy retrieved information on twenty-nine.
The RN offers a unique identifying code for each substance and
is particularly useful in a chemically oriented database such
as this. However, the same drug can be assigned more than one
RN depending on how it is described by the author. Thus the
searcher must consider current RNs as well as previous RNs.
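
Combining nonproprietary terms with current and previous RNs amounts to joining them in a single Boolean OR expression. The following Python sketch is one way to assemble such an expression. The function name, the `RN=` field prefix, and the registry numbers shown are placeholders chosen for illustration; the actual search syntax depends on the host system, and real RNs should be taken from a dictionary file.

```python
def build_or_query(names, registry_numbers, rn_prefix="RN="):
    """Join drug names and registry numbers into one OR expression,
    dropping blanks and case-insensitive duplicates."""
    parts = []
    seen = set()
    for name in names:
        term = name.strip()
        key = term.lower()
        if term and key not in seen:
            seen.add(key)
            parts.append(term)
    for rn in registry_numbers:
        number = rn.strip()
        key = (rn_prefix + number).lower()
        if number and key not in seen:
            seen.add(key)
            parts.append(rn_prefix + number)
    return " OR ".join(parts)


# Placeholder registry numbers (a current and a previous RN), not real ones.
query = build_or_query(["tacrine", "THA", "Tacrine"],
                       ["00000-00-0", "11111-11-1"])
print(query)  # tacrine OR THA OR RN=00000-00-0 OR RN=11111-11-1
```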

SCISEARCH and BIOSIS PREVIEWS had similar profiles of search
results, with the highest recall on nonproprietary terms (78%
and 84%) and some recall on proprietary terms (24% and 32%).
SCISEARCH derives its terms from the author's title and abstract
(since January 1991 for some records). Thus, drug names appear
in the form used in the original title (or author abstract when
available), whether spelled out, hyphenated, nonproprietary,
acronym, or proprietary. The complexities of "natural language"
must be considered when searching this database.

Searching investigational codes yielded the highest recall
(47%) in BIOSIS PREVIEWS and therefore is a useful concept to
consider in building a search strategy for this database. As in
SCISEARCH, searching on multiple nonproprietary terms can
significantly increase recall (although not always), and adding
proprietary terms does not significantly improve recall. Also
noted was a significant increase in number of hits per term,
possibly due to the availability of abstracts. The increased
number of hits per term may also reflect the broad coverage of
all life sciences and the large size of the database. It would
be useful to examine the retrieved citations and evaluate
precision, which would require a more defined search question
than was used in this study.

PNI also yielded high recall on nonproprietary search
terms (75% of total) and, as in SCISEARCH and BIOSIS PREVIEWS,
there were more postings for older nonproprietary terms and
acronyms. PNI covered significantly more proprietary terms
(58%), the highest of any of the four databases. Thus it is
useful to consider all proprietary terms when searching PNI. A
combination search strategy using multiple nonproprietary and
proprietary terms yields the highest recall in PNI.

This study looked at only four representative databases 
from among hundreds. Likewise, this study used two dictionary 
files for term identification, where many other files could 
have been used. Also, only thirty drug entities (chosen arbi- 
trarily) were studied from among thousands of drug entities. 
Still, some general conclusions can be drawn from this study.

First, the study underscores the need for identification
and use of variant terminology when searching for information
on drug entities. No single term can be relied on to retrieve
complete information about a particular drug. The searcher must
consider many things including:

- orientation of the database to be searched

- whether indexing is added (via descriptors or abstracts)
  or relies on author language for its terms

- where the database is produced (e.g., Great Britain vs,

- preferred nomenclature of the database to be searched

Secondly, the study revealed some potential pitfalls. One
such problem is possible errors in the databases, either in the
dictionary files or in the bibliographic files. In this study,
one of the proprietary terms identified in the Merck Index was
incorrectly spelled ‘zantic’; it should be ‘zantac.’ Without
knowledge of this, many citations would be missed, particularly
since ‘zantac’ is the name of the product as distributed in the
United States.

Another problem involves false drops when searching certain
acronyms (which may also be common abbreviations for other
terms) and certain proprietary names which are ambiguous. For
example, one of the proprietary terms identified for
‘ranitidine’ was ‘trigger,’ and this term retrieved primarily
false drops such as "FDA Supplemental Appropriations Bill to
‘Trigger’ User Fees" and "Versatile ‘Trigger’ and Time-Delay
Generator for Laser-Enhanced Time-of-Flight Mass Spectrometry."
For this reason, the term ‘trigger’ was eliminated from this
study. Similarly, the disproportionately higher number of
postings for ‘THA’ in BIOSIS PREVIEWS might indicate several
false drops.
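
A simple screen for likely false drops can be sketched in Python as follows. It flags a term for manual review when the term is also an ordinary word or when its postings are disproportionately high compared with the primary nonproprietary term. The function name, the threshold of five times the primary count, the word list, and all posting counts are assumptions made for illustration and are not results from this study. A flagged term is only a candidate for inspection, since older synonyms and acronyms can legitimately retrieve more postings than the current term.

```python
def flag_suspect_terms(postings, primary_term, ratio=5.0,
                       common_words=frozenset()):
    """Return the terms that deserve a manual check for false drops."""
    primary_count = postings[primary_term]
    flagged = []
    for term, count in postings.items():
        if term == primary_term:
            continue
        is_common_word = term.lower() in common_words
        is_disproportionate = (primary_count > 0
                               and count > ratio * primary_count)
        if is_common_word or is_disproportionate:
            flagged.append(term)
    return sorted(flagged)


# Hypothetical posting counts, for illustration only.
print(flag_suspect_terms({"ranitidine": 120, "zantac": 80, "trigger": 45},
                         "ranitidine", common_words={"trigger"}))
print(flag_suspect_terms({"tacrine": 40, "THA": 900}, "tacrine"))
```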

This study showed that use of the nonproprietary term and
its synonyms or acronyms yields the highest recall and, at
minimum, the searcher must identify alternate nonproprietary
terms. To improve recall, it is useful to identify other types
of terms for searching depending on the database being searched.
For example, use of CAS RNs in Analytical Abstracts or
proprietary terms in PNI together with nonproprietary terms
improves recall. It is also important to consider that a term
which yields few postings may yield the best information, and
therefore looking only at quantitative results does not indicate
anything about the quality of results.
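
The database-specific choices described above can be summarized in a small lookup table. The Python sketch below pairs each of the four databases with the term types that this study found most useful there; the table layout, the function name, the term-type labels, and the placeholder code and registry number are hypothetical and serve only to illustrate the idea.

```python
# Term types found most useful in each database, per the conclusions above.
TERM_TYPES_BY_DATABASE = {
    "Analytical Abstracts": ("nonproprietary", "registry_number"),
    "SCISEARCH": ("nonproprietary",),
    "BIOSIS PREVIEWS": ("nonproprietary", "investigational_code"),
    "PNI": ("nonproprietary", "proprietary"),
}


def select_terms(drug_terms, database):
    """Pick the terms to search for one drug in one database."""
    selected = []
    for term_type in TERM_TYPES_BY_DATABASE[database]:
        selected.extend(drug_terms.get(term_type, []))
    return selected


# The code and registry number below are placeholders, not real values.
ranitidine = {
    "nonproprietary": ["ranitidine"],
    "proprietary": ["zantac"],
    "investigational_code": ["CODE-0000"],
    "registry_number": ["00000-00-0"],
}
print(select_terms(ranitidine, "PNI"))  # ['ranitidine', 'zantac']
```

The table is a starting point only; initial search results may still call for adding or dropping term types.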

Future research might examine four or more databases of 
the same orientation (e.g., four pharmaceutical business data- 
bases) to compare indexing of drugs. It would also be useful 
to look at more than thirty drugs and/or do a more in-depth 
comparison of search results for newer drugs vs. older drugs. 

In conclusion, searchers seeking information involving
drug entities must begin with a basic understanding of the
diverse nomenclature used in the pharmaceutical literature.
The searcher must effectively utilize chemical dictionary files
to determine likely alternate terms and must consider the
orientation and indexing policies of the databases to be
searched. Finally, the searcher would be well served to plan the
search strategy prior to going online, but to observe initial
search results closely and make changes in search strategy if
necessary. The complexities of drug nomenclature make searching
for drug information somewhat of an art, and the searcher must
be creative when seeking such information.




6. APPENDIX

List of Drugs Studied

Newer Drugs

acecainide
azithromycin
cefixime
cifenline
desflurane
dilevalol
halofantrine
mifepristone
nedocromil
ondansetron
oxaprozin
paroxetine
tacrine
terbinafine
viloxazine

Older Drugs

albuterol
baclofen
clobetasol
etoposide
indapamide
mechlorethamine
mefloquine
methocarbamol
methotrexate
methylphenidate
miconazole
nifedipine
prazosin
ranitidine
sulfasalazine